The Stochastic Score Classification Problem
Consider the following Stochastic Score Classification Problem. A doctor is assessing a patient's risk of developing a certain disease, and can perform n tests on the patient. Each test has a binary outcome, positive or negative. A positive result is an indication of risk, and a patient's score is the total number of positive test results. Test results are accurate. The doctor needs to classify the patient into one of B risk classes, depending on the score (e.g., LOW, MEDIUM, and HIGH risk). Each of these classes corresponds to a contiguous range of scores. Test i has probability p_i of being positive, and it costs c_i to perform. To reduce costs, instead of performing all tests, the doctor will perform them sequentially and stop testing when it is possible to determine the patient's risk category. The problem is to determine the order in which the doctor should perform the tests, so as to minimize expected testing cost. We provide approximation algorithms for adaptive and non-adaptive versions of this problem, and pose a number of open questions.
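The stopping rule described above can be made concrete: testing halts as soon as the minimum and maximum scores still achievable fall in the same risk class. The following is an illustrative simulation only; the function name, the threshold encoding, and the particular class split are hypothetical and not taken from the paper.

```python
def classify_cost(order, costs, outcomes, thresholds, n):
    """Perform tests in `order` until the risk class is determined.

    thresholds: sorted upper bounds of the contiguous score ranges,
    e.g. [2, 5, n] means scores 0-2, 3-5, and 6-n form the three
    classes (a hypothetical split). Returns (cost paid, class index).
    """
    positives = 0
    remaining = n      # tests not yet performed
    paid = 0.0

    def cls(score):
        # index of the first range whose upper bound covers the score
        return next(i for i, t in enumerate(thresholds) if score <= t)

    for i in order:
        # stop as soon as every completion of the remaining tests
        # yields the same class: min and max achievable scores agree
        if cls(positives) == cls(positives + remaining):
            break
        paid += costs[i]
        positives += outcomes[i]
        remaining -= 1
    return paid, cls(positives)
```

With thresholds [0, 3] (score 0 is one class, scores 1-3 the other), a single positive first test already determines the class, so testing stops after paying only the first test's cost.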
Follow Your Star: New Frameworks for Online Stochastic Matching with Known and Unknown Patience
We study several generalizations of the Online Bipartite Matching problem. We
consider settings with stochastic rewards, patience constraints, and weights
(both vertex- and edge-weighted variants). We introduce a stochastic variant of
the patience-constrained problem, where the patience is chosen randomly
according to some known distribution and is not known until the point at which
patience has been exhausted. We also consider stochastic arrival settings
(i.e., online vertex arrival is determined by a known random process), which
are natural settings that are able to beat the hard worst-case bounds of more
pessimistic adversarial arrivals.
Our approach to online matching utilizes black-box algorithms for matching on
star graphs under various models of patience. In support of this, we design
algorithms which solve the star graph problem optimally for patience with a
constant hazard rate and yield a 1/2-approximation for any patience
distribution. This 1/2-approximation also improves existing guarantees for
cascade-click models in the product ranking literature, in which a user must be
shown a sequence of items with various click-through-rates and the user's
patience could run out at any time.
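To illustrate the star graph setting, the sketch below computes the expected matched weight of a fixed probing order under a known patience distribution: each probe happens only if no earlier probe succeeded and the user's patience has not yet run out. All names and the PMF encoding are hypothetical; this is not the paper's algorithm, only the evaluation of one probing order.

```python
def expected_value(order, probs, weights, patience_pmf):
    """Expected matched weight when probing star edges in `order`.

    patience_pmf[k] = Pr[patience == k]; probing stops at the first
    success or when patience is exhausted, whichever comes first.
    """
    # survive[k] = Pr[patience >= k]
    survive = [sum(patience_pmf[k:]) for k in range(len(patience_pmf))]
    total, alive = 0.0, 1.0   # alive = Pr[no success so far]
    for t, e in enumerate(order):
        # the (t+1)-th probe happens iff patience >= t+1 and no
        # earlier probe succeeded
        p_reach = survive[t + 1] * alive if t + 1 < len(survive) else 0.0
        total += p_reach * probs[e] * weights[e]
        alive *= 1.0 - probs[e]
    return total
```

Evaluating this for different orders makes the ranking trade-off visible: a high-weight, low-probability edge probed first burns patience that a safer edge could have used.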
We then build a framework which uses these star graph algorithms as black
boxes to solve the online matching problems under different arrival settings.
We show improved (or first-known) competitive ratios for these problems.
Finally, we present negative results that include formalizing the concept of a
stochasticity gap for LP upper bounds on these problems, bounding the
worst-case performance of some popular greedy approaches, and showing the
impossibility of having an adversarial patience in the product ranking setting.Comment: 43 page
Approximating Two-Stage Stochastic Supplier Problems
The main focus of this paper is radius-based (supplier) clustering in the two-stage stochastic setting with recourse, where the inherent stochasticity of the model comes in the form of a budget constraint. We also explore a number of variants where additional constraints are imposed on the first-stage decisions, specifically matroid and multi-knapsack constraints.
Our eventual goal is to provide results for supplier problems in the most general distributional setting, where there is only black-box access to the underlying distribution. To that end, we follow a two-step approach. First, we develop algorithms for a restricted version of each problem, in which all possible scenarios are explicitly provided; second, we employ a novel scenario-discarding variant of the standard Sample Average Approximation (SAA) method, in which we crucially exploit properties of the restricted-case algorithms. We finally note that the scenario-discarding modification to the SAA method is necessary in order to optimize over the radius.
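For intuition, the standard SAA idea, replacing the true expectation with an empirical mean over sampled scenarios, can be sketched on a toy one-dimensional instance. All names are hypothetical, and this implements only plain SAA, not the paper's scenario-discarding variant.

```python
def saa(sample_scenarios, candidates, first_stage_cost, recourse_cost):
    """Plain Sample Average Approximation on an explicit candidate set.

    Minimizes first_stage_cost(x) plus the empirical mean of
    recourse_cost(x, s) over the sampled scenarios s.
    """
    best_x, best_val = None, float("inf")
    for x in candidates:
        val = first_stage_cost(x) + sum(
            recourse_cost(x, s) for s in sample_scenarios
        ) / len(sample_scenarios)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val
```

For example, with scenarios {1, 2, 3}, first-stage cost x, and recourse cost 2*max(s - x, 0), the empirical optimum balances paying upfront against paying recourse later.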
Agent Environment Cycle Games
Partially Observable Stochastic Games (POSGs) are the most general and common
model of games used in Multi-Agent Reinforcement Learning (MARL). We argue that
the POSG model is conceptually ill suited to software MARL environments, and
offer case studies from the literature where this mismatch has led to severely
unexpected behavior. In response to this, we introduce the Agent Environment
Cycle Games (AEC Games) model, which is more representative of software
implementation. We then prove that it is an equivalent model to POSGs. The AEC
games model is also uniquely useful in that it can elegantly represent all
forms of MARL environments, whereas POSGs, for example, cannot elegantly
represent strictly turn-based games like chess.
Comment: The work in this paper has been merged into the paper "PettingZoo:
Gym for Multi-Agent Reinforcement Learning" arXiv:2009.1447
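For intuition, the agent-environment cycle, in which exactly one agent observes and acts per environment step, can be sketched as a toy loop. The class and method names below are hypothetical illustrations in the spirit of the AEC model, not the actual PettingZoo API.

```python
class TurnTakingEnv:
    """Toy AEC-style environment: two agents strictly alternate turns."""

    def __init__(self, steps=4):
        self.agents = ["player_0", "player_1"]
        self._turn = 0
        self._remaining = steps

    @property
    def agent_selection(self):
        # the single agent whose turn it is to act
        return self.agents[self._turn % 2]

    def observe(self, agent):
        # each agent sees only its own observation
        return {"agent": agent, "steps_left": self._remaining}

    def step(self, action):
        # exactly one agent acts per cycle step: the defining AEC property
        self._remaining -= 1
        self._turn += 1

    @property
    def done(self):
        return self._remaining <= 0


def run(env):
    """Drive the cycle: select agent, observe, act, advance."""
    history = []
    while not env.done:
        agent = env.agent_selection
        _obs = env.observe(agent)
        env.step(action=0)   # dummy action
        history.append(agent)
    return history
```

Contrast this with a POSG-style loop, where all agents would submit actions simultaneously each step; the strict alternation above is exactly what makes turn-based games like chess natural to express.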